Where to Play: Retrieval of Video Segments using Natural-Language Queries
Authors
Abstract
In this paper, we propose a new approach for retrieving video segments using natural-language queries. Unlike most previous approaches, such as concept-based methods or rule-based structured models, the proposed method uses an image captioning model to construct sentential queries for visual information. In detail, our approach exploits multiple captions generated from the visual features of each image with 'Densecap'. Then, the similarities between captions of adjacent images are calculated and used to track semantically similar captions over multiple frames. Besides introducing this novel idea of 'tracking by captioning', the proposed method is one of the first approaches that uses a language generation model learned by neural networks to construct semantic queries describing the relations and properties of visual information. To evaluate the effectiveness of our approach, we have created a new evaluation dataset, which contains about 348 segments of scenes in 20 movie trailers. Through quantitative and qualitative evaluation, we show that our method is effective for retrieval of video segments using natural-language queries.
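The 'tracking by captioning' idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the matching rule, so the token-overlap (Jaccard) similarity, the greedy matching, the `threshold` value, and the function names `jaccard`, `track_captions`, and `retrieve` are all assumptions made for illustration. Captions are passed in as plain strings; no captioning model is called.

```python
from typing import List, Tuple

Track = List[Tuple[int, str]]  # (frame index, caption) pairs


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the paper's measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def track_captions(frames: List[List[str]], threshold: float = 0.5) -> List[Track]:
    """Greedily link similar captions in adjacent frames into tracks."""
    finished: List[Track] = []
    active: List[Track] = []
    for idx, captions in enumerate(frames):
        next_active: List[Track] = []
        used = set()  # caption indices already claimed in this frame
        for track in active:
            last_caption = track[-1][1]
            best_j, best_sim = -1, threshold
            for j, cap in enumerate(captions):
                if j in used:
                    continue
                sim = jaccard(last_caption, cap)
                if sim >= best_sim:
                    best_j, best_sim = j, sim
            if best_j >= 0:
                used.add(best_j)
                track.append((idx, captions[best_j]))
                next_active.append(track)
            else:
                finished.append(track)  # no match: the track ends here
        # Captions that matched no existing track start new tracks.
        for j, cap in enumerate(captions):
            if j not in used:
                next_active.append([(idx, cap)])
        active = next_active
    return finished + active


def retrieve(tracks: List[Track], query: str, top_k: int = 3):
    """Rank tracks by their best caption match; return (start, end, score)."""
    scored = [
        (t[0][0], t[-1][0], max(jaccard(query, cap) for _, cap in t))
        for t in tracks
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]


# Hypothetical per-frame captions, standing in for captioning-model output.
frames = [
    ["a man riding a horse", "a blue sky"],
    ["a man riding a brown horse", "a blue sky with clouds"],
    ["a car on the road"],
]
tracks = track_captions(frames)
print(retrieve(tracks, "man riding a horse", top_k=1))
```

Each track spans the frames over which one caption persists, so the start and end frame of the best-scoring track give the retrieved segment. A learned sentence similarity could replace `jaccard` without changing the rest of the sketch.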
Similar resources
Indexing an intelligent video database using evolutionary control
In this paper we present the implementation of an intelligent video database using evolutionary control. By using automatic video indexing techniques, the retrieval of video segments can be performed using free natural language queries. Retrieval of video segments from a database for editing and viewing is becoming an important topic in video processing. A cinematic movie consists of video segm...
Speech Recognition and Information Retrieval:
The Informedia Digital Video Library Project at Carnegie Mellon University is creating large digital libraries of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. These digital video libraries allow users to explore multi-media data in depth as well as in breadth. The Informed...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Similarity using Named Entities
In this paper, we attempt to tackle the MediaEval 2012 Search and Hyperlinking challenge, which focuses on video segment retrieval from a large dataset, based on short natural language queries, as well as linking the resulting segments to related ones. Our approach makes use of three semantic similarity metrics, merged by applying late fusion.
Large-Scale Query-by-Image Video Retrieval Using Bloom Filters
We consider the problem of using image queries to retrieve videos from a database. Our focus is on large-scale applications, where it is infeasible to index each database video frame independently. Our main contribution is a framework based on Bloom filters, which can be used to index long video segments, enabling efficient image-to-video comparisons. Using this framework, we investigate severa...
UTwente does Brave New Tasks for MediaEval 2012: Searching and Hyperlinking
In this paper we report our experiments and results for the brave new searching and hyperlinking tasks for the MediaEval Benchmark Initiative 2012. The searching task involves finding target video segments based on a short natural language sentence query and the hyperlinking task involves finding links from the target video segments to other related video segments in the collection using a set ...
Journal title:
- CoRR
Volume: abs/1707.00251
Publication date: 2017